holdOut function

您所在的位置:网站首页 holdout method holdOut function

holdOut function

#holdOut function| 来源: 网络整理| 查看: 265

The idea of this function is to carry out a hold out experiment of a given learning system on a given data set. The goal of this experiment is to estimate the value of a set of evaluation statistics by means of the hold out method. Hold out estimates are obtained by randomly dividing the given data set in two separate partitions, one that is used for obtaining the prediction model and the other for testing it. This learn+test process is repeated k times. In the end the average of the k scores obtained on each repetition is the hold out estimate.

It is the user responsibility to decide which statistics are to be evaluated on each iteration and how they are calculated. This is done by creating a function that the user knows it will be called by this hold out routine at each repetition of the learn+test process. This user-defined function must assume that it will receive in the first 3 arguments a formula, a training set and a testing set, respectively. It should also assume that it may receive any other set of parameters that should be passed towards the learning algorithm. The result of this user-defined function should be a named vector with the values of the statistics to be estimated obtained by the learner when trained with the given training set, and tested on the given test set. See the Examples section below for an example of these functions.

If the itsInfo parameter is set to the value TRUE then the hldRun object that is the result of the function will have an attribute named itsInfo that will contain extra information from the individual repetitions of the hold out process. This information can be accessed by the user by using the function attr(), e.g. attr(returnedObject,'itsInfo'). For this information to be collected on this attribute the user needs to code its user-defined functions in a way that it returns the vector of the evaluation statistics with an associated attribute named itInfo (note that it is "itInfo" and not "itsInfo" as above), which should be a list containing whatever information the user wants to collect on each repetition. This apparently complex infra-structure allows you to pass whatever information you which from each iteration of the experimental process. A typical example is the case where you want to check the individual predictions of the model on each test case of each repetition. You could pass this vector of predictions as a component of the list forming the attribute itInfo of the statistics returned by your user-defined function. In the end of the experimental process you will be able to inspect/use these predictions by inspecting the attribute itsInfo of the hldRun object returned by the holdOut() function. See the Examples section for an illustration of this potentiality.



【本文地址】


今日新闻


推荐新闻


CopyRight 2018-2019 办公设备维修网 版权所有 豫ICP备15022753号-3